Non-Inferiority Designs in A/B Testing
Abstract
Most, if not all, of the current statistical literature on online randomized controlled experiments (commonly referred to as “A/B tests”) focuses on superiority designs. That is, the error of the first kind is formulated as incorrectly rejecting a composite null hypothesis of the treatment having no effect or having a negative effect. It is then controlled via a statistical significance threshold or confidence intervals, or via posterior probabilities and credible intervals in Bayesian approaches. However, there is no reason to limit all A/B testing practice to tests for superiority. The current paper argues that there are many cases where testing for non-inferiority is both more appropriate and more powerful in the statistical sense, resulting in better decision-making and, in some cases, significantly faster tests. Non-inferiority tests are appropriate when one cares about the treatment being at least as good as the current solution, with “as good as” being defined by a specified non-inferiority margin (sometimes referred to as an “equivalence margin”). Certain non-inferiority designs can result in faster testing compared to a similar superiority test. The paper introduces two separate approaches for designing non-inferiority A/B tests: tests planned for a true difference of zero or more and tests planned for a positive true difference. It provides several examples of applying both approaches to cases from conversion rate optimization. Sample size calculations are provided for both approaches, and a comparison is made between them and between non-inferiority and superiority tests. Finally, drawbacks specific to non-inferiority tests are discussed, with guidance on how to limit or control them in practice.
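The sample size comparison described in the abstract can be illustrated with a rough sketch. The code below uses the common textbook normal-approximation formula for a one-sided two-proportion z-test; it is not taken from the paper, and the paper's own calculations may differ. The function name `sample_size_per_group`, its parameters, and the baseline rate, margin, and effect sizes in the example are all assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p_control, true_diff, margin=0.0, alpha=0.05, power=0.8):
    """Approximate per-group sample size for a one-sided two-proportion z-test.

    Null hypothesis: p_treatment - p_control <= -margin.
    margin = 0 gives a superiority test; margin > 0 gives a non-inferiority test.
    This is a textbook approximation and a hypothetical helper, not the paper's method.
    """
    p_treatment = p_control + true_diff
    # Distance between the planned true difference and the null boundary (-margin).
    distance = true_diff + margin
    if distance <= 0:
        raise ValueError("true_diff + margin must be positive for the test to have power")
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    # Sum of the Bernoulli variances of the two groups.
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return ceil((z_alpha + z_beta) ** 2 * variance / distance ** 2)


# Illustrative (assumed) numbers: 10% baseline conversion rate.
baseline = 0.10

# Superiority test planned for a true lift of 1 percentage point.
n_superiority = sample_size_per_group(baseline, true_diff=0.01)

# Non-inferiority test planned for a true difference of zero, margin of 1 point.
n_noninf_zero = sample_size_per_group(baseline, true_diff=0.0, margin=0.01)

# Non-inferiority test planned for a positive true difference (same 1-point lift,
# plus a 0.5-point margin).
n_noninf_positive = sample_size_per_group(baseline, true_diff=0.01, margin=0.005)

print("superiority:", n_superiority)
print("non-inferiority, planned for zero difference:", n_noninf_zero)
print("non-inferiority, planned for positive difference:", n_noninf_positive)
```

Under this approximation the required sample size scales with the inverse square of `true_diff + margin`, so for the same planned true difference any positive margin reduces the sample size relative to the superiority test. That is one way to read the abstract's remark that certain non-inferiority designs can be faster.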
Similar resources
Optimal group-sequential designs for simultaneous testing of superiority and non-inferiority.
Confirmatory clinical trials comparing the efficacy of a new treatment with an active control typically aim at demonstrating either superiority or non-inferiority. In the latter case, the objective is to show that the experimental treatment is not worse than the active control by more than a pre-specified non-inferiority margin. We consider two classes of group-sequential designs that combine t...
Meeting the demand for more sophisticated study designs. A proposal for a new type of clinical trial: the hybrid design
Background Treatment effect is traditionally assessed through either superiority or non-inferiority clinical trials. Investigators may find that because of safety concerns and/or wide variability across strata of the superiority margin of active controls over placebo, neither a superiority nor a non-inferiority trial design is ethical or practical in some disease populations. Prior knowledge ma...
Robust designs in non-inferiority three arm clinical trials with presence of heteroscedasticity
In this paper, we describe an adjusted method to facilitate a non-inferiority trial by a three-arm robust design. Because local optimal designs derived in Hasler et al. [2007] require knowledge about the ratios of the population variances and are not necessarily robust with respect to possible misspecifications, a maximin approach is adopted. This method requires only the specification of an in...
Testing superiority and non-inferiority hypotheses in active controlled clinical trials.
Switching between testing for superiority and non-inferiority has been an important statistical issue in the design and analysis of active controlled clinical trial. In practice, it is often conducted with a two-stage testing procedure. It has been assumed that there is no type I error rate adjustment required when either switching to test for non-inferiority once the data fail to support the s...
Does KRAS Testing in Metastatic Colorectal Cancer Impact Overall Survival? A Comparative Effectiveness Study in a Population-Based Sample
PURPOSE Epidermal growth factor receptor (EGFR) inhibitors are approved for treating metastatic colorectal cancer (CRC); KRAS mutation testing is recommended prior to treatment. We conducted a non-inferiority analysis to examine whether KRAS testing has impacted survival in CRC patients. PATIENTS AND METHODS We included 1186 metastatic CRC cases from seven health plans. A cutpoint of July, 20...
Publication date: 2017